A faster alternative to Pandas `isin` function

Posted by user3576212 on Stack Overflow
Published on 2014-05-30T01:06:16Z Indexed on 2014/05/30 3:26 UTC

I have a very large data frame df that looks like:

ID       Value1    Value2
1345      3.2      332
1355      2.2      32
2346      1.0      11
3456      8.9      322

I also have a list, ID_list, that contains a subset of the IDs. I need the subset of df whose ID is contained in ID_list.

Currently, I am using df_sub = df[df.ID.isin(ID_list)] to do it, but it takes a lot of time. The IDs in ID_list don't follow any pattern, so they don't fall within a certain range. (I also need to apply the same operation to many similar dataframes.) Is there a faster way to do this? Would it help much if I made ID the index?
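To make the two options concrete, here is a minimal sketch of the isin mask next to an index-based lookup. It uses a toy frame copied from the sample above and assumes the IDs are unique; the names df_indexed, wanted and df_sub_indexed are made up for illustration. Whether the index version is actually faster is not established here and would need timing on the real data.

```python
import pandas as pd

# Toy frame mirroring the sample rows in the question.
df = pd.DataFrame({
    "ID": [1345, 1355, 2346, 3456],
    "Value1": [3.2, 2.2, 1.0, 8.9],
    "Value2": [332, 32, 11, 322],
})
ID_list = [1355, 3456, 9999]  # 9999 is deliberately absent from df

# Current approach from the question: boolean mask built with isin.
df_sub = df[df.ID.isin(ID_list)]

# Candidate approach: make ID the index once, then select rows by label.
# Index.intersection keeps only the IDs that exist in the frame, so the
# .loc lookup does not raise KeyError for missing IDs such as 9999.
# Assumption: IDs are unique within the frame.
df_indexed = df.set_index("ID")
wanted = df_indexed.index.intersection(ID_list)
df_sub_indexed = df_indexed.loc[wanted]
```

Both selections should contain the same rows; the second has ID as its index instead of as a column. Since the same ID_list is applied to many dataframes, each variant could be timed once (for example with timeit) on a representative frame before committing to one.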

Thanks!

© Stack Overflow or respective owner
